Results 1 - 20 of 220
1.
JMIR Med Inform ; 12: e51171, 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38596848

ABSTRACT

Background: With the capability to render prediagnoses, consumer wearables have the potential to affect subsequent diagnoses and the level of care in the health care delivery setting. Despite this, postmarket surveillance of consumer wearables has been hindered by the lack of codified terms in electronic health records (EHRs) to capture wearable use. Objective: We sought to develop a weak supervision-based approach to demonstrate the feasibility and efficacy of EHR-based postmarket surveillance on consumer wearables that render atrial fibrillation (AF) prediagnoses. Methods: We applied data programming, where labeling heuristics are expressed as code-based labeling functions, to detect incidents of AF prediagnoses. A labeler model was then derived from the predictions of the labeling functions using the Snorkel framework. The labeler model was applied to clinical notes to probabilistically label them, and the labeled notes were then used as a training set to fine-tune a classifier called Clinical-Longformer. The resulting classifier identified patients with an AF prediagnosis. A retrospective cohort study was conducted, where the baseline characteristics and subsequent care patterns of patients identified by the classifier were compared against those who did not receive a prediagnosis. Results: The labeler model derived from the labeling functions showed high accuracy (0.92; F1-score=0.77) on the training set. The classifier trained on the probabilistically labeled notes accurately identified patients with an AF prediagnosis (0.95; F1-score=0.83). 
The cohort study conducted using the constructed system carried enough statistical power to verify the key findings of the Apple Heart Study, which enrolled a much larger number of participants, where patients who received a prediagnosis tended to be older, male, and White with higher CHA2DS2-VASc (congestive heart failure, hypertension, age ≥75 years, diabetes, stroke, vascular disease, age 65-74 years, sex category) scores (P<.001). We also made a novel discovery that patients with a prediagnosis were more likely to use anticoagulants (525/1037, 50.63% vs 5936/16,560, 35.85%) and have an eventual AF diagnosis (305/1037, 29.41% vs 262/16,560, 1.58%). At the index diagnosis, the existence of a prediagnosis did not distinguish patients based on clinical characteristics, but did correlate with anticoagulant prescription (P=.004 for apixaban and P=.01 for rivaroxaban). Conclusions: Our work establishes the feasibility and efficacy of an EHR-based surveillance system for consumer wearables that render AF prediagnoses. Further work is necessary to generalize these findings for patient populations at other sites.
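The data programming step described in the Methods can be illustrated with a minimal sketch. It is not the authors' code and does not use the Snorkel API: the keyword patterns and the three labeling functions are hypothetical, and a simple vote fraction stands in for the learned labeler model that the study derives with Snorkel.

```python
import re

ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

# Hypothetical keyword patterns; the study's real heuristics are not given in the abstract.
WEARABLE = r"(?:apple watch|smart ?watch|fitbit|wearables?)"
AF = r"(?:atrial fibrillation|a-?fib)"


def lf_wearable_and_af(note):
    """Vote POSITIVE when a wearable and AF are both mentioned."""
    has_wearable = re.search(rf"\b{WEARABLE}\b", note, re.I)
    has_af = re.search(rf"\b{AF}\b", note, re.I)
    return POSITIVE if has_wearable and has_af else ABSTAIN


def lf_denies_wearable(note):
    """Vote NEGATIVE when the note says the patient does not use a wearable."""
    pattern = rf"\b(?:denies|does not (?:own|use|wear))\b.{{0,40}}\b{WEARABLE}\b"
    return NEGATIVE if re.search(pattern, note, re.I) else ABSTAIN


def lf_no_wearable_mention(note):
    """Vote NEGATIVE when no wearable is mentioned at all."""
    return ABSTAIN if re.search(rf"\b{WEARABLE}\b", note, re.I) else NEGATIVE


LABELING_FUNCTIONS = [lf_wearable_and_af, lf_denies_wearable, lf_no_wearable_mention]


def soft_label(note):
    """Return the fraction of non-abstaining votes that are POSITIVE, or None if all abstain."""
    votes = [lf(note) for lf in LABELING_FUNCTIONS]
    cast = [v for v in votes if v != ABSTAIN]
    if not cast:
        return None
    return sum(v == POSITIVE for v in cast) / len(cast)


if __name__ == "__main__":
    for text in [
        "Patient reports Apple Watch alerted her to possible atrial fibrillation.",
        "Patient does not own a smartwatch; known atrial fibrillation.",
    ]:
        print(soft_label(text), "<-", text)
```

The soft labels produced this way play the role of the probabilistic labels that the abstract says were used to fine-tune the downstream classifier. A learned label model would additionally weight each labeling function by its estimated accuracy, which this vote fraction does not do.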

2.
Lancet Digit Health ; 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38658283

ABSTRACT

With the rapid growth of interest in and use of large language models (LLMs) across various industries, we are facing some crucial and profound ethical concerns, especially in the medical field. The unique technical architecture and purported emergent abilities of LLMs differentiate them substantially from other artificial intelligence (AI) models and natural language processing techniques used, necessitating a nuanced understanding of LLM ethics. In this Viewpoint, we highlight ethical concerns stemming from the perspectives of users, developers, and regulators, notably focusing on data privacy and rights of use, data provenance, intellectual property contamination, and broad applications and plasticity of LLMs. A comprehensive framework and mitigating strategies will be imperative for the responsible integration of LLMs into medical practice, ensuring alignment with ethical principles and safeguarding against potential societal risks.

3.
Article in English | MEDLINE | ID: mdl-38452298

ABSTRACT

OBJECTIVES: This article aims to examine how generative artificial intelligence (AI) can be adopted with the most value in health systems, in response to the Executive Order on AI. MATERIALS AND METHODS: We reviewed how technology has historically been deployed in healthcare, and evaluated recent examples of deployments of both traditional AI and generative AI (GenAI) with a lens on value. RESULTS: Traditional AI and GenAI are different technologies in terms of their capability and modes of current deployment, which have implications on value in health systems. DISCUSSION: Traditional AI when applied with a framework top-down can realize value in healthcare. GenAI in the short term when applied top-down has unclear value, but encouraging more bottom-up adoption has the potential to provide more benefit to health systems and patients. CONCLUSION: GenAI in healthcare can provide the most value for patients when health systems adapt culturally to grow with this new technology and its adoption patterns.

5.
JAMA ; 331(1): 17-18, 2024 01 02.
Article in English | MEDLINE | ID: mdl-38032634

ABSTRACT

This Viewpoint discusses a recent executive order by US President Joe Biden about the development and implementation of AI, including the role of government vs the private sector and how the order may affect health care.


Subjects
Artificial Intelligence , Delivery of Health Care , Delivery of Health Care/legislation & jurisprudence , Group Practice/legislation & jurisprudence , Organizations/legislation & jurisprudence , Politics , Federal Government , United States
6.
J Palliat Med ; 27(1): 83-89, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37935036

ABSTRACT

Background: Patients with serious illness benefit from conversations to share prognosis and explore goals and values. To address this, we implemented Ariadne Labs' Serious Illness Care Program (SICP) at Stanford Health Care. Objective: Improve quantity, timing, and quality of serious illness conversations. Methods: Initial implementation followed Ariadne Labs' SICP framework. We later incorporated a team-based approach that included nonphysician care team members. Outcomes included number of patients with documented conversations according to clinician role and practice location. Machine learning algorithms were used in some settings to identify eligible patients. Results: Ambulatory oncology and hospital medicine were our largest implementation sites, engaging 4707 and 642 unique patients in conversations, respectively. Clinicians across eight disciplines engaged in these conversations. Identified barriers included leadership engagement, complex workflows, and patient identification. Conclusion: Several factors contributed to successful SICP implementation across clinical sites: innovative clinical workflows, machine learning-based predictive algorithms, and nonphysician care team member engagement.


Subjects
Critical Care , Critical Illness , Humans , Critical Illness/therapy , Communication , Physician-Patient Relations , Academic Medical Centers
7.
JAMA ; 331(3): 245-249, 2024 01 16.
Article in English | MEDLINE | ID: mdl-38117493

ABSTRACT

Importance: Given the importance of rigorous development and evaluation standards needed of artificial intelligence (AI) models used in health care, nationwide accepted procedures to provide assurance that the use of AI is fair, appropriate, valid, effective, and safe are urgently needed. Observations: While there are several efforts to develop standards and best practices to evaluate AI, there is a gap between having such guidance and the application of such guidance to both existing and new AI models being developed. As of now, there is no publicly available, nationwide mechanism that enables objective evaluation and ongoing assessment of the consequences of using health AI models in clinical care settings. Conclusion and Relevance: The need to create a public-private partnership to support a nationwide health AI assurance labs network is outlined here. In this network, community best practices could be applied for testing health AI models to produce reports on their performance that can be widely shared for managing the lifecycle of AI models over time and across populations and sites where these models are deployed.


Subjects
Artificial Intelligence , Delivery of Health Care , Laboratories , Quality Assurance, Health Care , Quality of Health Care , Artificial Intelligence/standards , Health Facilities/standards , Laboratories/standards , Public-Private Sector Partnerships , Quality Assurance, Health Care/standards , Delivery of Health Care/standards , Quality of Health Care/standards , United States
9.
JAMA Netw Open ; 6(9): e2333495, 2023 09 05.
Article in English | MEDLINE | ID: mdl-37725377

ABSTRACT

Importance: Ranitidine, the most widely used histamine-2 receptor antagonist (H2RA), was withdrawn because of N-nitrosodimethylamine impurity in 2020. Given the worldwide exposure to this drug, the potential risk of cancer development associated with the intake of known carcinogens is an important epidemiological concern. Objective: To examine the comparative risk of cancer associated with the use of ranitidine vs other H2RAs. Design, Setting, and Participants: This new-user active comparator international network cohort study was conducted using 3 health claims and 9 electronic health record databases from the US, the United Kingdom, Germany, Spain, France, South Korea, and Taiwan. Large-scale propensity score (PS) matching was used to minimize confounding of the observed covariates with negative control outcomes. Empirical calibration was performed to account for unobserved confounding. All databases were mapped to a common data model. Database-specific estimates were combined using random-effects meta-analysis. Participants included individuals aged at least 20 years with no history of cancer who used H2RAs for more than 30 days from January 1986 to December 2020, with a 1-year washout period. Data were analyzed from April to September 2021. Exposure: The main exposure was use of ranitidine vs other H2RAs (famotidine, lafutidine, nizatidine, and roxatidine). Main Outcomes and Measures: The primary outcome was incidence of any cancer, except nonmelanoma skin cancer. Secondary outcomes included all cancer except thyroid cancer, 16 cancer subtypes, and all-cause mortality. Results: Among 1 183 999 individuals in 11 databases, 909 168 individuals (mean age, 56.1 years; 507 316 [55.8%] women) were identified as new users of ranitidine, and 274 831 individuals (mean age, 58.0 years; 145 935 [53.1%] women) were identified as new users of other H2RAs. 
Crude incidence rates of cancer were 14.30 events per 1000 person-years (PYs) in ranitidine users and 15.03 events per 1000 PYs among other H2RA users. After PS matching, cancer risk was similar in ranitidine compared with other H2RA users (incidence, 15.92 events per 1000 PYs vs 15.65 events per 1000 PYs; calibrated meta-analytic hazard ratio, 1.04; 95% CI, 0.97-1.12). No significant associations were found between ranitidine use and any secondary outcomes after calibration. Conclusions and Relevance: In this cohort study, ranitidine use was not associated with an increased risk of cancer compared with the use of other H2RAs. Further research is needed on the long-term association of ranitidine with cancer development.


Subjects
Skin Neoplasms , Thyroid Neoplasms , Female , Humans , Middle Aged , Male , Ranitidine/adverse effects , Cohort Studies , Histamine H2 Antagonists/adverse effects
10.
JAMIA Open ; 6(3): ooad054, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37545984

ABSTRACT

Objective: To describe the infrastructure, tools, and services developed at Stanford Medicine to maintain its data science ecosystem and research patient data repository for clinical and translational research. Materials and Methods: The data science ecosystem, dubbed the Stanford Data Science Resources (SDSR), includes infrastructure and tools to create, search, retrieve, and analyze patient data, as well as services for data deidentification, linkage, and processing to extract high-value information from healthcare IT systems. Data are made available via self-service and concierge access, on HIPAA compliant secure computing infrastructure supported by in-depth user training. Results: The Stanford Medicine Research Data Repository (STARR) functions as the SDSR data integration point, and includes electronic medical records, clinical images, text, bedside monitoring data and HL7 messages. SDSR tools include tools for electronic phenotyping, cohort building, and a search engine for patient timelines. The SDSR supports patient data collection, reproducible research, and teaching using healthcare data, and facilitates industry collaborations and large-scale observational studies. Discussion: Research patient data repositories and their underlying data science infrastructure are essential to realizing a learning health system and advancing the mission of academic medical centers. Challenges to maintaining the SDSR include ensuring sufficient financial support while providing researchers and clinicians with maximal access to data and digital infrastructure, balancing tool development with user training, and supporting the diverse needs of users. Conclusion: Our experience maintaining the SDSR offers a case study for academic medical centers developing data science and research informatics infrastructure.

11.
JAMA ; 330(9): 866-869, 2023 09 05.
Article in English | MEDLINE | ID: mdl-37548965

ABSTRACT

Importance: There is increased interest in and potential benefits from using large language models (LLMs) in medicine. However, by simply wondering how the LLMs and the applications powered by them will reshape medicine instead of getting actively involved, the agency in shaping how these tools can be used in medicine is lost. Observations: Applications powered by LLMs are increasingly used to perform medical tasks without the underlying language model being trained on medical records and without verifying their purported benefit in performing those tasks. Conclusions and Relevance: The creation and use of LLMs in medicine need to be actively shaped by provisioning relevant training data, specifying the desired benefits, and evaluating the benefits via testing in real-world deployments.


Subjects
Language , Machine Learning , Medical Records , Medicine , Medical Records/standards , Medicine/methods , Medicine/standards , Computer Simulation
12.
JAMIA Open ; 6(2): ooad043, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37397506

ABSTRACT

Objective: Biases within probabilistic electronic phenotyping algorithms are largely unexplored. In this work, we characterize differences in subgroup performance of phenotyping algorithms for Alzheimer's disease and related dementias (ADRD) in older adults. Materials and methods: We created an experimental framework to characterize the performance of probabilistic phenotyping algorithms under different racial distributions allowing us to identify which algorithms may have differential performance, by how much, and under what conditions. We relied on rule-based phenotype definitions as reference to evaluate probabilistic phenotype algorithms created using the Automated PHenotype Routine for Observational Definition, Identification, Training and Evaluation framework. Results: We demonstrate that some algorithms have performance variations anywhere from 3% to 30% for different populations, even when not using race as an input variable. We show that while performance differences in subgroups are not present for all phenotypes, they do affect some phenotypes and groups more disproportionately than others. Discussion: Our analysis establishes the need for a robust evaluation framework for subgroup differences. The underlying patient populations for the algorithms showing subgroup performance differences have great variance between model features when compared with the phenotypes with little to no differences. Conclusion: We have created a framework to identify systematic differences in the performance of probabilistic phenotyping algorithms specifically in the context of ADRD as a use case. Differences in subgroup performance of probabilistic phenotyping algorithms are not widespread nor do they occur consistently. This highlights the great need for careful ongoing monitoring to evaluate, measure, and try to mitigate such differences.

13.
NPJ Digit Med ; 6(1): 135, 2023 Jul 29.
Article in English | MEDLINE | ID: mdl-37516790

ABSTRACT

The success of foundation models such as ChatGPT and AlphaFold has spurred significant interest in building similar models for electronic medical records (EMRs) to improve patient care and hospital operations. However, recent hype has obscured critical gaps in our understanding of these models' capabilities. In this narrative review, we examine 84 foundation models trained on non-imaging EMR data (i.e., clinical text and/or structured data) and create a taxonomy delineating their architectures, training data, and potential use cases. We find that most models are trained on small, narrowly-scoped clinical datasets (e.g., MIMIC-III) or broad, public biomedical corpora (e.g., PubMed) and are evaluated on tasks that do not provide meaningful insights on their usefulness to health systems. Considering these findings, we propose an improved evaluation framework for measuring the benefits of clinical foundation models that is more closely grounded to metrics that matter in healthcare.

15.
J Am Med Inform Assoc ; 30(9): 1532-1542, 2023 08 18.
Article in English | MEDLINE | ID: mdl-37369008

ABSTRACT

OBJECTIVE: Healthcare institutions are establishing frameworks to govern and promote the implementation of accurate, actionable, and reliable machine learning models that integrate with clinical workflow. Such governance frameworks require an accompanying technical framework to deploy models in a resource-efficient, safe, and high-quality manner. Here we present DEPLOYR, a technical framework for enabling real-time deployment and monitoring of researcher-created models into a widely used electronic medical record system. MATERIALS AND METHODS: We discuss core functionality and design decisions, including mechanisms to trigger inference based on actions within electronic medical record software, modules that collect real-time data to make inferences, mechanisms that close the loop by displaying inferences back to end-users within their workflow, monitoring modules that track performance of deployed models over time, silent deployment capabilities, and mechanisms to prospectively evaluate a deployed model's impact. RESULTS: We demonstrate the use of DEPLOYR by silently deploying and prospectively evaluating 12 machine learning models trained using electronic medical record data that predict laboratory diagnostic results, triggered by clinician button-clicks in Stanford Health Care's electronic medical record. DISCUSSION: Our study highlights the need and feasibility for such silent deployment, because prospectively measured performance varies from retrospective estimates. When possible, we recommend using prospectively estimated performance measures during silent trials to make final go decisions for model deployment. CONCLUSION: Machine learning applications in healthcare are extensively researched, but successful translations to the bedside are rare. By describing DEPLOYR, we aim to inform machine learning deployment best practices and help bridge the model implementation gap.


Subjects
Electronic Health Records , Software , Retrospective Studies , Machine Learning
16.
Drug Saf ; 46(8): 725-742, 2023 08.
Article in English | MEDLINE | ID: mdl-37340238

ABSTRACT

INTRODUCTION: Pharmacovigilance programs protect patient health and safety by identifying adverse event signals through postmarketing surveillance of claims data and spontaneous reports. Electronic health records (EHRs) provide new opportunities to address limitations of traditional approaches and promote discovery-oriented pharmacovigilance. METHODS: To evaluate the current state of EHR-based medication safety signal identification, we conducted a scoping literature review of studies aimed at identifying safety signals from routinely collected patient-level EHR data. We extracted information on study design, EHR data elements utilized, analytic methods employed, drugs and outcomes evaluated, and key statistical and data analysis choices. RESULTS: We identified 81 eligible studies. Disproportionality methods were the predominant analytic approach, followed by data mining and regression. Variability in study design makes direct comparisons difficult. Studies varied widely in terms of data, confounding adjustment, and statistical considerations. CONCLUSION: Despite broad interest in utilizing EHRs for safety signal identification, current efforts fail to leverage the full breadth and depth of available data or to rigorously control for confounding. The development of best practices and application of common data models would promote the expansion of EHR-based pharmacovigilance.


Subjects
Adverse Drug Reaction Reporting Systems , Electronic Health Records , Humans , Pharmacovigilance , Data Mining
17.
EClinicalMedicine ; 58: 101932, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37034358

ABSTRACT

Background: Adverse events of special interest (AESIs) were pre-specified to be monitored for the COVID-19 vaccines. Some AESIs are not only associated with the vaccines, but with COVID-19. Our aim was to characterise the incidence rates of AESIs following SARS-CoV-2 infection in patients and compare these to historical rates in the general population. Methods: A multi-national cohort study with data from primary care, electronic health records, and insurance claims mapped to a common data model. This study's evidence was collected between Jan 1, 2017 and the conclusion of each database (which ranged from Jul 2020 to May 2022). The 16 pre-specified prevalent AESIs were: acute myocardial infarction, anaphylaxis, appendicitis, Bell's palsy, deep vein thrombosis, disseminated intravascular coagulation, encephalomyelitis, Guillain-Barré syndrome, haemorrhagic stroke, non-haemorrhagic stroke, immune thrombocytopenia, myocarditis/pericarditis, narcolepsy, pulmonary embolism, transverse myelitis, and thrombosis with thrombocytopenia. Age-sex standardised incidence rate ratios (SIR) were estimated to compare post-COVID-19 to pre-pandemic rates in each of the databases. Findings: Substantial heterogeneity by age was seen for AESI rates, with some clearly increasing with age but others following the opposite trend. Similarly, differences were also observed across databases for the same health outcome and age-sex strata. All studied AESIs appeared consistently more common in the post-COVID-19 compared to the historical cohorts, with related meta-analytic SIRs ranging from 1.32 (1.05 to 1.66) for narcolepsy to 11.70 (10.10 to 13.70) for pulmonary embolism. Interpretation: Our findings suggest all AESIs are more common after COVID-19 than in the general population. Thromboembolic events were particularly common, and over 10-fold more so. More research is needed to contextualise post-COVID-19 complications in the longer term. Funding: None.
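To make the age-sex standardised incidence rate ratio concrete, here is a small sketch that assumes the usual indirect-standardisation reading: events observed in the post-COVID-19 cohort divided by the events expected if each age-sex stratum had experienced the historical rate. The function name, strata, and numbers are illustrative only and are not taken from the study.

```python
def standardised_incidence_ratio(observed_events, person_years, reference_rates):
    """Observed events divided by events expected under the reference (historical) rates.

    All three arguments are dicts keyed by stratum, e.g. (sex, age band).
    reference_rates are events per person-year in the historical cohort.
    """
    expected = sum(reference_rates[stratum] * person_years[stratum]
                   for stratum in person_years)
    observed = sum(observed_events[stratum] for stratum in person_years)
    return observed / expected


if __name__ == "__main__":
    # Illustrative numbers only.
    observed = {("F", "18-64"): 5, ("M", "65+"): 7}
    follow_up = {("F", "18-64"): 2000.0, ("M", "65+"): 1000.0}
    historical = {("F", "18-64"): 0.001, ("M", "65+"): 0.004}
    print(standardised_incidence_ratio(observed, follow_up, historical))
```

With these made-up inputs, 12 events are observed against 6 expected, giving a ratio of about 2, which would be read as the event being twice as common as in the historical cohort after accounting for age and sex.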

18.
J Am Med Inform Assoc ; 30(5): 878-887, 2023 04 19.
Article in English | MEDLINE | ID: mdl-36795076

ABSTRACT

OBJECTIVE: There are over 363 customized risk models of the American College of Cardiology and the American Heart Association (ACC/AHA) pooled cohort equations (PCE) in the literature, but their gains in clinical utility are rarely evaluated. We build new risk models for patients with specific comorbidities and geographic locations and evaluate whether performance improvements translate to gains in clinical utility. MATERIALS AND METHODS: We retrain a baseline PCE using the ACC/AHA PCE variables and revise it to incorporate subject-level information of geographic location and 2 comorbidity conditions. We apply fixed effects, random effects, and extreme gradient boosting (XGB) models to handle the correlation and heterogeneity induced by locations. Models are trained using 2 464 522 claims records from Optum©'s Clinformatics® Data Mart and validated in the hold-out set (N = 1 056 224). We evaluate models' performance overall and across subgroups defined by the presence or absence of chronic kidney disease (CKD) or rheumatoid arthritis (RA) and geographic locations. We evaluate models' expected utility using net benefit and models' statistical properties using several discrimination and calibration metrics. RESULTS: The revised fixed effects and XGB models yielded improved discrimination, compared to baseline PCE, overall and in all comorbidity subgroups. XGB improved calibration for the subgroups with CKD or RA. However, the gains in net benefit are negligible, especially under low exchange rates. CONCLUSIONS: Common approaches to revising risk calculators incorporating extra information or applying flexible models may enhance statistical performance; however, such improvement does not necessarily translate to higher clinical utility. Thus, we recommend future works to quantify the consequences of using risk calculators to guide clinical decisions.


Subjects
Arthritis, Rheumatoid , Atherosclerosis , Renal Insufficiency, Chronic , Humans , Cardiovascular Diseases/epidemiology , Comorbidity , Risk Assessment , Risk Factors , United States , Atherosclerosis/epidemiology
19.
J Biomed Inform ; 139: 104319, 2023 03.
Article in English | MEDLINE | ID: mdl-36791900

ABSTRACT

Despite the creation of thousands of machine learning (ML) models, the promise of improving patient care with ML remains largely unrealized. Adoption into clinical practice is lagging, in large part due to disconnects between how ML practitioners evaluate models and what is required for their successful integration into care delivery. Models are just one component of care delivery workflows whose constraints determine clinicians' abilities to act on models' outputs. However, methods to evaluate the usefulness of models in the context of their corresponding workflows are currently limited. To bridge this gap we developed APLUS, a reusable framework for quantitatively assessing via simulation the utility gained from integrating a model into a clinical workflow. We describe the APLUS simulation engine and workflow specification language, and apply it to evaluate a novel ML-based screening pathway for detecting peripheral artery disease at Stanford Health Care.


Subjects
Delivery of Health Care , Machine Learning , Humans , Computer Simulation , Workflow , Language
20.
J Am Med Inform Assoc ; 30(4): 668-673, 2023 03 16.
Article in English | MEDLINE | ID: mdl-36810659

ABSTRACT

OBJECTIVE: The objective of this study is to provide a method to calculate model performance measures in the presence of resource constraints, with a focus on net benefit (NB). MATERIALS AND METHODS: To quantify a model's clinical utility, the Equator Network's TRIPOD guidelines recommend the calculation of the NB, which reflects whether the benefits conferred by intervening on true positives outweigh the harms conferred by intervening on false positives. We refer to the NB achievable in the presence of resource constraints as the realized net benefit (RNB), and provide formulae for calculating the RNB. RESULTS: Using 4 case studies, we demonstrate the degree to which an absolute constraint (eg, only 3 available intensive care unit [ICU] beds) diminishes the RNB of a hypothetical ICU admission model. We show how the introduction of a relative constraint (eg, surgical beds that can be converted to ICU beds for very high-risk patients) allows us to recoup some of the RNB but with a higher penalty for false positives. DISCUSSION: RNB can be calculated in silico before the model's output is used to guide care. Accounting for the constraint changes the optimal strategy for ICU bed allocation. CONCLUSIONS: This study provides a method to account for resource constraints when planning model-based interventions, either to avoid implementations where constraints are expected to play a larger role or to design more creative solutions (eg, converted ICU beds) to overcome absolute constraints when possible.
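The idea of net benefit under an absolute constraint can be sketched as follows. The abstract does not reproduce the paper's formulae, so this uses the standard decision-curve definition of net benefit (true positives per patient minus false positives per patient weighted by the odds of the risk threshold) and one assumed reading of an absolute constraint: only the `capacity` highest-risk flagged patients can actually receive the intervention. Function names and data are hypothetical.

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Standard net benefit: TP/N - FP/N * threshold / (1 - threshold)."""
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def capacity_limited_net_benefit(risks, outcomes, threshold, capacity):
    """Net benefit when at most `capacity` flagged patients can be treated.

    Assumption: flagged patients are served in order of predicted risk,
    and flagged patients beyond capacity receive no intervention.
    """
    flagged = sorted(
        (pair for pair in zip(risks, outcomes) if pair[0] >= threshold),
        key=lambda pair: pair[0],
        reverse=True,
    )
    treated = flagged[:capacity]
    true_pos = sum(1 for _, outcome in treated if outcome == 1)
    false_pos = len(treated) - true_pos
    return net_benefit(true_pos, false_pos, len(risks), threshold)


if __name__ == "__main__":
    # Illustrative data: 10 patients, predicted risk and observed outcome.
    risks = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.1, 0.05, 0.05, 0.05]
    outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(capacity_limited_net_benefit(risks, outcomes, 0.5, capacity=10))
    print(capacity_limited_net_benefit(risks, outcomes, 0.5, capacity=3))
```

In this toy example the net benefit drops from about 0.2 with no binding constraint to about 0.1 when only 3 beds are available, because one true positive can no longer be acted on. A relative constraint, such as convertible beds at a higher cost, would need an extra penalty term that this sketch does not model.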


Subjects
Hospitalization , Intensive Care Units , Humans , Machine Learning